Restart psi-secrets after refresh so serve reloads fresh cache by jdoss · Pull Request #35 · quickvm/psi

jdoss · 2026-04-17T20:45:58Z

Summary

Every time psi-{provider}-refresh.timer fires, setup re-registers secrets via delete+create through the Podman API, which assigns fresh hex IDs. Setup writes those new IDs to the on-disk cache file and the prune step from PR #32 drops the old entries. But serve holds the OLD cache in memory from its last startup and never picks up the new file state — so every lookup after the first refresh goes straight to the provider, and the cache does no work until an operator manually restarts psi-secrets.
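The stale-cache behavior described above can be illustrated with a toy model. This is only a sketch: the `Serve` class, the `refresh` helper, and the example IDs are hypothetical stand-ins, not psi's actual code. It shows why a process that snapshots the cache at startup misses every lookup once the IDs are regenerated.

```python
# Toy model only: all names here are hypothetical, not psi's implementation.


class Serve:
    """Loads the cache once at startup and never re-reads it."""

    def __init__(self, disk_cache: dict[str, str]) -> None:
        # Snapshot of the on-disk cache taken at startup.
        self._cache = dict(disk_cache)

    def lookup(self, secret_id: str) -> str:
        """Return which source answers the lookup: 'cache' or 'provider'."""
        return "cache" if secret_id in self._cache else "provider"


def refresh(disk_cache: dict[str, str], new_entries: dict[str, str]) -> None:
    """Mimic setup: write entries under fresh IDs and prune the old ones."""
    disk_cache.clear()
    disk_cache.update(new_entries)


disk = {"aaa111": "db-password"}
serve = Serve(disk)
before = serve.lookup("aaa111")           # ID was known at startup

refresh(disk, {"bbb222": "db-password"})  # delete+create assigns a new ID
stale = serve.lookup("bbb222")            # the running serve never saw this ID

restarted = Serve(disk)                   # the effect of restarting serve
after = restarted.lookup("bbb222")
```

In this model, `before` and `after` resolve from the cache while `stale` falls through to the provider, which matches the all-provider lookups reported below.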

Observed on the test server

1554 secret lookups over 30 minutes, zero cache hits. All source: provider. The refresh timer had fired 7 minutes earlier and silently broke the cache. Victoria Logs flagged it via the sheer volume of provider-source events.

Fix

Add a second ExecStart to the refresh wrapper that runs systemctl try-restart psi-secrets.service after setup completes. try-restart is a no-op if serve is not currently active, so this is safe on hosts that have intentionally stopped psi-secrets.

During the serve restart there is a brief window (~30s on HSM) in which lookups miss the cache. That window occurs at most once per cache.refresh_interval (default 1h); without the restart, the cache never serves a hit again after the first refresh.
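A minimal sketch of how a unit generator might emit the wrapper described in this section, assuming the refresh wrapper is a oneshot service. The function name `render_refresh_service` and the placeholder setup command are hypothetical; the real psi unit generator and its setup command line may look different.

```python
def render_refresh_service(setup_command: str) -> str:
    """Render a oneshot [Service] section with two ExecStart lines.

    Hypothetical helper for illustration; not psi's actual generator.
    """
    lines = [
        "[Service]",
        # Type=oneshot is what allows more than one ExecStart line.
        "Type=oneshot",
        # First command: re-register secrets and rewrite the on-disk cache.
        f"ExecStart={setup_command}",
        # Second command: restart serve only if it is currently active.
        "ExecStart=systemctl try-restart psi-secrets.service",
    ]
    return "\n".join(lines) + "\n"


# The setup command below is a placeholder, not a real psi invocation.
unit = render_refresh_service("/path/to/setup-command")
exec_lines = [line for line in unit.splitlines() if line.startswith("ExecStart=")]
```

systemd runs the ExecStart lines of a oneshot service sequentially and skips the remaining ones if a command fails, so in this sketch the restart only happens after setup has completed successfully.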

Test plan

pytest tests/test_unitgen.py — new regression test test_restarts_psi_secrets_so_serve_reloads_the_fresh_cache; all 52 unitgen tests pass.
ruff check / ty check — clean.
Deploy to test server, run psi systemd install to regenerate the wrapper, fire the refresh, confirm Victoria Logs shows source: cache entries going forward.

Remaining issue (separate PR)

Victoria Logs classifies all PSI log entries as level: error because loguru writes to stderr and conmon maps stderr → PRIORITY: 3. Separate from this PR — a follow-up will split INFO/DEBUG to stdout (PRIORITY 6).

jdoss merged commit 89e373c into master Apr 17, 2026
1 of 2 checks passed

jdoss merged 1 commit into master from fix/refresh-restarts-serve
